The effect of disfluencies and learner errors on the parsing of spoken learner language

Authors

  • Andrew Caines
  • Paula Buttery
Abstract

NLP tools are typically trained on written data from native speakers. However, research into language acquisition and tools for language teaching & proficiency assessment would benefit from accurate processing of spoken data from second language learners. In this paper we discuss manual annotation schemes for various features of spoken language; we also evaluate the automatic tagging of one particular feature (filled pauses) – finding a success rate of 81%; and we evaluate the effect of using our manual annotations to ‘clean up’ the transcriptions for sentence parsing, resulting in a 25% improvement in parse success rate by completely cleaning the texts of disfluencies and errors. We discuss the need to adapt existing NLP technology to non-canonical domains such as spoken learner language, while emphasising the worth of continued integration of manual and automatic annotation.
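The two steps the abstract mentions, tagging filled pauses and cleaning a transcription before parsing, can be illustrated with a minimal sketch. It is not the authors' tagger: the `FILLED_PAUSES` lexicon, the function names and the example utterance are all assumptions made for illustration, and the lookup is a plain word-list match rather than a trained model.

```python
import re

# Hypothetical filled-pause lexicon (an assumption; the paper's own inventory is not given here).
FILLED_PAUSES = {"er", "erm", "um", "uh", "eh", "mm"}


def tag_filled_pauses(tokens):
    """Pair each token with True if it matches the filled-pause lexicon."""
    return [(tok, tok.lower() in FILLED_PAUSES) for tok in tokens]


def clean_transcription(text):
    """Drop filled pauses so the remaining tokens can be handed to a parser."""
    tokens = re.findall(r"[A-Za-z']+", text)
    return " ".join(tok for tok, is_fp in tag_filled_pauses(tokens) if not is_fp)


def success_rate(predicted, gold):
    """Share of tokens whose automatic label agrees with the manual annotation."""
    if not gold or len(predicted) != len(gold):
        raise ValueError("label sequences must be non-empty and of equal length")
    return sum(p == g for p, g in zip(predicted, gold)) / len(gold)


if __name__ == "__main__":
    # Invented learner utterance, used only to exercise the functions.
    print(clean_transcription("um I er go to the uh cinema"))
```

A word-list lookup like this cannot separate a filled pause from a homograph used as a real word, which is one reason automatic labels are worth scoring against manual annotation, as `success_rate` does.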

Similar resources

Incremental Dependency Parsing and Disfluency Detection in Spoken Learner English

This paper investigates the suitability of state-of-the-art natural language processing (NLP) tools for parsing the spoken language of second language learners of English. The task of parsing spoken learner-language is important to the domains of automated language assessment (ALA) and computer-assisted language learning (CALL). Due to the non-canonical nature of spoken language (containing fil...

Second Language Writing Through Blogs: An Investigation of Learner Autonomy

Employing an explanatory sequential design, the present study investigated the effect of English as a Foreign Language (EFL) blog-mediated writing instruction on the students’ learner autonomy. A total of 46 learners, drawn from two intact classes, were randomly assigned to control and experimental groups. Over a 16-week semester, the control group students (n = 21) were taught ba...

LEARNER INITIATIVES ACROSS QUESTION-ANSWER SEQUENCES: A CONVERSATION ANALYTIC ACCOUNT OF LANGUAGE CLASSROOM DISCOURSE

This paper investigates learner-initiated responses to English language teachers’ referential questions and learner initiatives after teachers’ feedback moves in meaning-focused question-answer sequences to analyze how interactional practices of language teachers, their initiation and feedback moves, facilitate learner initiatives. Classroom discourse research has largely neglected learner init...

Mediating Role of Identity Styles and Learner Autonomy in Writing Ability

This study investigates the relationship between EFL (English as a foreign language) learners’ autonomy, their identity styles, and their writing ability, and aims to show which independent variables have greater predictive power over variance in writing. To this end, 60 Iranian university EFL students at the language center of the researchers’ institution were selected to participate in this study...

Teacher and Learner in Humanistic Language Teaching

Since ‘the development of whole person’ was brought to the focus of attention by humanist psychologists as a central concern in educational theory, affective variables have been assumed to have a significant share in the learning process that goes on in a pedagogical setting. Meanwhile, the process of second language development, because of the very nature of language as a vehicle for communica...

Publication date: 2014